Introduction to Python and Visual Studio Code

The story of Python

Before installing and using Python, let’s first discuss what Python is. Unlike some other programming languages, such as C or Java, Python was created with simplicity and readability in mind. It was developed by Guido van Rossum in the late 1980s and first released in 1991 as an open-source language. Python has since gained wide popularity thanks to its simplicity, flexibility, and extensive libraries for tasks such as data analysis, web development, automation, and machine learning.

Python’s design philosophy emphasizes readability and ease of use, making it an excellent choice for beginners, data scientists, and researchers alike. Its versatility means you can use it for both simple tasks and complex data analysis projects.

Why Python

Python is one of the best options for data scientists and data analysts because it is free, well supported for data analysis and machine learning, and able to run on most platforms (Windows, Mac, Linux, etc.). Furthermore, the large and active community surrounding Python offers plenty of resources for learning and support. It is also easy for developers to share packages and libraries, which gives Python users early access to the latest tools and methods in data science from various fields. Lastly, Python makes it easy to share our own code, often referred to as a script. A script serves as a complete record of the analysis we have conducted, which is a crucial feature for reproducible work and research.

Installing Python

Now that we have a first impression of what Python is, let’s see how we can start using it. First, we need to install Python on our local computer, a task that is quick and easy. We can do so by following the link below and downloading the latest version from the official website. During the installation process, it is advisable to simply select the default options: https://www.python.org/downloads/

After the installation is complete, we can use Python immediately. When we open Python, we see its console, which looks like this:

Figure 1.1: The Python console.

In the console, we type commands and press Enter to execute them. For instance, we can type 2+2 or 3 + 1 (leaving spaces between the characters does not make any difference) and then press Enter to see that the result in both cases is 4:

Figure 1.2: Example calculations in the Python console.
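
The same two calculations can also be written as a short snippet. This is only a minimal sketch: the variable names without_spaces and with_spaces are our own, chosen for illustration.

```python
# Spaces around an operator do not change the result of an expression.
without_spaces = 2+2
with_spaces = 3 + 1

print(without_spaces)                 # 4
print(with_spaces)                    # 4
print(without_spaces == with_spaces)  # True
```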

Each time we press Enter, the code written on the corresponding line runs. Of course, we will use Python for much more complex tasks than simple arithmetic. Although we could start working with the Python console, it is much more convenient to work with Visual Studio Code.

Installing Visual Studio Code

Visual Studio Code (VS Code) is a free code editor that, together with its Python extension, works much like an integrated development environment (IDE): it provides a user-friendly interface and tools that facilitate code writing, data analysis, and visualization. In simpler terms, VS Code is a tool that helps us use Python in a much more convenient and flexible way. To understand the difference between Python and VS Code, we can imagine that Python is like the engine of a car and VS Code is like the driver’s cockpit, with a user-friendly dashboard and controls that make it easier to operate the car and get the most out of its engine. Without an engine, though, the driver’s cockpit would not be useful at all.

We can install VS Code by clicking the link below and following the instructions (once again, it is highly recommended to stick with the default options): https://code.visualstudio.com/download

When we open VS Code for the first time, we get the following setup:

Figure 1.3: Visual Studio Code interface.

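To start a new script, we click on File -> New File -> Python File (this option is provided by the Python extension for VS Code, so the extension needs to be installed first). Now, VS Code should look like this: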

Figure 1.4: Visual Studio Code interface with a script.

We can type in the main area, which is called the Code Editor. For instance, we can write the famous first example "Hello World" and use the function print() to display the result:

Figure 1.5: Typing on the Visual Studio Code Editor.

To execute this code, we have the following options:

  1. Click the Run button at the top right of the editor.
  2. Right-click the code and select Run Python -> Run Python File in Terminal.

Once we do this, VS Code asks us to save the file with the extension .py, which is the standard format for Python scripts, and then executes the code. The path of the saved file is now shown at the top of the editor. Usually, the Terminal at the bottom opens automatically and shows the command that runs the script, followed by the output of the code. In this case, we will see the message Hello World printed in the Terminal:

print("Hello World") 
Hello World

The Terminal is a special interface where we can communicate directly with the operating system and with Python. It is a text-based environment where commands are typed and results are displayed. Although it may look unfamiliar at first, it is simply another way to interact with programs through text instead of buttons and menus.

When we run a Python script in VS Code, the editor sends the file to Python, Python executes the code, and the Terminal shows the results. This creates a clear workflow: we write and save code in the Code Editor, and we see the results in the Terminal.
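
To see this workflow in action, we can ask Python itself which interpreter is running our script. The sketch below uses the standard sys module; the paths and version numbers it prints depend on our own installation, so no fixed output is shown.

```python
import sys

# sys.executable is the path of the Python interpreter that runs this script.
print(sys.executable)

# sys.version_info holds the version of that interpreter.
version = f"{sys.version_info.major}.{sys.version_info.minor}"
print("Running Python", version)
```

If we save these lines in a .py file and run it as described above, the Terminal shows where Python is installed and which version executed the code.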

Python Packages

Once we have installed Python and VS Code, we can start using Python immediately. It already includes many useful functions, such as print(), which displays the text or values we put inside the parentheses in the Terminal.

The standard functionality that comes with the installation of Python is usually called standard Python. As we mentioned, Python is a free and open-source programming language, which means there are many developers who have created their own contributions, or packages (and we can create our own as well!). In Python, packages are collections of specialized tools and functions that extend its capabilities, allowing the researcher or analyst to perform a wide range of tasks, from data analysis and visualization to implementing specialized statistical techniques, machine learning, and more.

Python comes with a standard library of ready-to-use modules, such as math for mathematical functions. However, many useful packages are not included by default and must be installed separately. One of the packages that we will use throughout this book is pandas. To install this package or any other, we can use the pip package manager in the Terminal.
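
As a brief illustration of the standard library, the sketch below imports math and calls a few of its functions. Nothing needs to be installed for this to work; the particular functions shown are simply our choice of examples.

```python
import math

print(math.sqrt(16))        # 4.0
print(math.floor(3.7))      # 3
print(round(math.pi, 2))    # 3.14
```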

The pip command

The pip command is Python’s standard tool for downloading, installing, and managing packages from the Python Package Index (PyPI), and it is included with most Python installations. When we run a command such as pip install pandas, pip finds the package online, downloads it, and installs it so we can use it in our code. We can also use pip to update packages or see which packages are already installed.

# Installing the pandas package
pip install pandas

After running this command, pip downloads and installs the package on our computer. However, we cannot start using the package immediately in our script. Installing means the package exists on our computer, but we still need to import it before use:

# Importing the pandas package
import pandas as pd

In practice, we usually import packages using a short alias. In the code above, we import the pandas package with the alias pd. This is a common convention in Python, as it makes the code shorter, cleaner, and easier to write.
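
The difference between installing and importing can be made concrete with a small check. The sketch below is one possible approach, not the only one: is_installed is a hypothetical helper of our own, built on the standard importlib.util.find_spec function, and pandas is imported under its alias only if it is found.

```python
import importlib.util

def is_installed(package_name):
    """Return True if Python can find the package in the current environment."""
    return importlib.util.find_spec(package_name) is not None

print(is_installed("math"))  # True, because math ships with Python

if is_installed("pandas"):
    import pandas as pd  # the conventional alias
    print("pandas version:", pd.__version__)
else:
    print("pandas is not installed; run 'pip install pandas' in the Terminal")
```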

It is also important to distinguish between a package and a module. A module is a single Python file that contains specific functions, classes, or variables. A package is a collection of related modules that we install and use together, such as pandas. In simple terms, a package can be thought of as a complete toolbox, while a module is one individual tool inside that toolbox.
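
We can observe this distinction using the standard library. In the sketch below, json is a package that contains several modules (one of them is json.decoder), while math is a single module. The helper is_package is our own illustrative function; it relies on the fact that packages carry a __path__ attribute and plain modules do not.

```python
import json
import json.decoder
import math

def is_package(obj):
    # Packages have a __path__ attribute that lists where their modules live.
    return hasattr(obj, "__path__")

print(is_package(json))          # True: a package (the toolbox)
print(is_package(json.decoder))  # False: one module inside that package
print(is_package(math))          # False: a single module
```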

Regarding pandas specifically, we will see in later chapters what exactly it can do and why we need it. For now, it is sufficient to understand that a package is a set of tools that extends Python’s functionality.

We can also see which packages are installed on our system by running the following command in the Terminal:

# Displaying all installed Python packages in the environment
pip list
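
A similar list can be produced from inside a Python script. The following is a rough sketch that assumes Python 3.8 or later, where the standard importlib.metadata module is available; its output depends entirely on what is installed in our environment and is formatted differently from pip list.

```python
from importlib.metadata import distributions

# Collect the names of all installed distributions, skipping any without a name.
names = set()
for dist in distributions():
    meta = dist.metadata
    name = meta["Name"] if meta is not None else None
    if name:
        names.add(name)

# Sort alphabetically, ignoring upper and lower case.
installed = sorted(names, key=str.lower)

for name in installed:
    print(name)
```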

Lastly, while typing the functions mentioned above, we may have noticed another very useful feature of VS Code: when we type the first letters of a function, such as pr, VS Code automatically suggests possible options that we can choose with the mouse (or the keyboard arrows):

Figure 1.6: Using VS Code’s autocomplete feature.

In this way, even if we do not quite remember the full name of a function, we can benefit from this useful VS Code feature.

The Jupyter Option

In addition to running Python scripts in a traditional .py file inside VS Code, there is another very popular and highly useful option called Jupyter. This refers to Jupyter Notebooks, an interactive computing environment that allows us to write and execute Python code in small, separated sections called cells.

A Jupyter Notebook (often saved with the .ipynb extension) works differently from a standard Python script. Instead of running an entire file at once, we can run code cell by cell, seeing the output immediately below each section of code. This makes it especially useful for data analysis, visualization, and step-by-step experimentation, where we want to explore results gradually rather than execute everything at once.

Each notebook is connected to a kernel, which is the “engine” that runs the Python code. When we execute a cell, the kernel processes the code and returns the result directly underneath it. This interactive workflow makes it easy to test ideas, debug code, and document our thought process in a clear and structured way.
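
Because the kernel keeps running between cells, anything defined in one cell remains available in the cells we execute afterwards. The sketch below imitates two cells with comments; the variable names prices and average are our own illustrative choices.

```python
# --- Cell 1: define a value; the kernel remembers it after the cell finishes ---
prices = [10, 20, 30]

# --- Cell 2: reuse the value from Cell 1 without defining it again ---
average = sum(prices) / len(prices)
print(average)  # 20.0
```

If we later change prices in the first cell, we need to re-run both cells for the new average to appear.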

The Jupyter environment also allows us to mix code, text, and visual output in a single document. Each cell contains either code or text, so we can write explanations in Markdown next to the Python code they describe. Because cells can be added, edited, and re-run independently, we can quickly test ideas, modify parts of the workflow, and immediately see how each change affects the output.

For example, the image below illustrates how text precedes the Python code, with the output printed directly below it.

Figure 1.7: Example of a Jupyter Notebook including text, code, and output.

Displaying Output in Jupyter Notebooks

In Jupyter Notebooks, it is not necessary to use print() to display the output of the last line in a cell. If an expression is placed at the end of a cell, its result is automatically shown below the cell once it is executed. The print() function is still useful when we want to display multiple outputs or format text more explicitly, but for simple expressions it can often be omitted.
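
The sketch below imitates two notebook cells to show the difference. The cell boundaries are marked with comments, and the variable name total is our own. In a notebook, the value of the last expression in each cell is displayed automatically, whereas in a .py script only the print() call produces output.

```python
# --- Cell 1 ---
total = 2 + 2
total  # last line of the cell: a notebook displays 4 below it

# --- Cell 2 ---
print("Total:", total)  # explicit output; this also works in a .py script
total * 10              # last line of the cell: a notebook displays 40
```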

Jupyter Notebook is therefore not just an alternative way of writing Python code, but a more interactive and exploratory workflow. While traditional scripts are ideal for structured programs and reproducible pipelines, Jupyter Notebooks are particularly powerful for analysis, experimentation, and learning, where understanding each step is just as important as the final result.